Crate fixedstr

Library for several alternative string types using const generics.

  • Some types, such as str8 and zstr<8>, occupy only 8 bytes, compared to 16 bytes for &str on 64-bit systems, which makes them a more compact way of representing small strings.
  • Most types (except the optional Flexstr and Sharedstr) can be copied and stack-allocated.
  • #![no_std] is supported by all but the optional fstr type. Features that use the alloc crate can also be optionally excluded.
  • Unicode is supported by all but the optional cstr type.
  • Serde serialization is supported by all but the optional Sharedstr type.
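
The size comparison in the first point can be checked with the standard library alone. The sketch below does not use this crate: it assumes that an 8-byte inline string occupies the same space as a plain [u8; 8] array, and the function name inline_vs_slice is hypothetical.

```rust
use std::mem::size_of;

/// Returns (size of an 8-byte inline buffer, size of a `&str`).
/// `[u8; 8]` stands in for an 8-byte fixed string (an assumption of this
/// sketch); a `&str` is a fat pointer made of a data pointer plus a length.
pub fn inline_vs_slice() -> (usize, usize) {
    (size_of::<[u8; 8]>(), size_of::<&str>())
}
```

On a 64-bit target the second component is two machine words, that is, 16 bytes.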

COMPATIBILITY NOTICES:

With Version 0.5.0, the default availability of some string types has changed. The default configuration was minimized: the std, flex-str and shared-str options are no longer enabled by default, and the crate now supports #![no_std] by default. The std option only enables the fstr type, which prints warnings to stderr. Unless you require one of the types fstr, Flexstr or Sharedstr, your build configurations will most likely work as before, only with smaller builds. If default-features=false is already part of your configuration, it should also work as before.

Another change that could affect backwards compatibility is that zstr’s Index<usize> and IndexMut<usize> traits, which allow arbitrary modifications to the underlying bytes, are now only available with the optional experimental feature. Previously, they were available by default.

Other Important Recent Updates:

Version 0.5.1 introduced the new no-alloc option. In addition to support for no_std (for all but the fstr type), this option disables compilation of any features that use the alloc crate. This may make some no_std implementations easier. The default build is no longer minimal (see below).

As of Version 0.4.6, all string types except for fstr support #![no_std].

Starting in Version 0.4.2, the underlying representation of the zero-terminated zstr type no longer allows non-zero bytes after the first zero. In particular, the zstr::from_raw function now enforces this rule.

Starting in Version 0.4.0, warnings about capacity being exceeded are only sent to stderr when using the fstr type. For other types, truncation is done silently. Consider using the try_make function or the core::str::FromStr trait.
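
The difference between silent truncation and a fallible constructor can be sketched with a stand-alone type. Tiny<N> below is hypothetical and is not this crate's implementation: it borrows the length-in-first-byte layout described for tstr, assumes 1 <= N <= 256, and truncates at a character boundary, which is a choice made for this sketch only.

```rust
use core::str::FromStr;

/// Hypothetical fixed-capacity string: buf[0] holds the length and the text
/// follows in buf[1..]. Assumes 1 <= N <= 256.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Tiny<const N: usize> {
    buf: [u8; N],
}

impl<const N: usize> Tiny<N> {
    /// Copies as much of `s` as fits, silently dropping the rest.
    pub fn make(s: &str) -> Self {
        let mut n = s.len().min(N - 1).min(255);
        // Back up so that a multi-byte character is never split.
        while !s.is_char_boundary(n) {
            n -= 1;
        }
        let mut buf = [0u8; N];
        buf[0] = n as u8;
        buf[1..1 + n].copy_from_slice(&s.as_bytes()[..n]);
        Tiny { buf }
    }

    pub fn as_str(&self) -> &str {
        let n = self.buf[0] as usize;
        // Always valid UTF-8 because `make` cuts only at character boundaries.
        core::str::from_utf8(&self.buf[1..1 + n]).unwrap()
    }
}

impl<const N: usize> FromStr for Tiny<N> {
    type Err = &'static str;

    /// Reports an error instead of truncating.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        if s.len() > N - 1 || s.len() > 255 {
            Err("capacity exceeded")
        } else {
            Ok(Self::make(s))
        }
    }
}
```

With this sketch, Tiny::<8>::make("abcdefghij") keeps only "abcdefg", while "abcdefghij".parse::<Tiny<8>>() returns an error that the caller can handle.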


CRATE OVERVIEW

The two string types that are always provided by this crate are zstr and tstr. However, tstr is not public by default and should be referenced through the type aliases str4, str8, str16, … str256.

  • A zstr<N> is represented by a [u8;N] array underneath and can hold zero-terminated, utf-8 strings of up to N-1 bytes. Furthermore, no non-zero bytes can follow the first zero. This allows the length of a zstr<N> string to be found in O(log N) time.

  • The types str4 through str256 are aliases for internal types tstr<4> through tstr<256> respectively. These strings are stored in [u8;N] arrays with the first byte holding the length of the string. Each tstr<N> can store strings of up to N-1 bytes, with maximum N=256. Because Rust does not currently provide a way to specify conditions (or type casts) on const generics at compile time, the tstr type is not public by default and can only be used through the aliases. The pub-tstr option makes the tstr type public but is not recommended: any tstr<N> with N>256 is not valid and will result in erroneous behavior.
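
To illustrate why the zero-padding rule makes an O(log N) length possible, here is a minimal stand-alone sketch. The function zlen is hypothetical and operates on a plain byte slice rather than on this crate's zstr type. It relies on the stated invariant that every byte after the first zero is also zero, so the array splits into a non-zero prefix and a zero suffix, and the boundary can be found by binary search.

```rust
/// Length of a zero-terminated string stored in `buf`, assuming that no
/// non-zero byte follows the first zero. Runs in O(log N) for N = buf.len().
/// Returns buf.len() if the buffer contains no zero at all.
pub fn zlen(buf: &[u8]) -> usize {
    let (mut lo, mut hi) = (0, buf.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if buf[mid] == 0 {
            hi = mid; // the first zero is at mid or to its left
        } else {
            lo = mid + 1; // the first zero is strictly to the right of mid
        }
    }
    lo
}
```

Multi-byte UTF-8 sequences never contain a zero byte, so the search is not confused by non-ASCII text. The standard slice method partition_point expresses the same search in one call.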

In addition, the following string types are available as options:

  • A fstr<N> stores a string of up to N bytes. It’s represented by a [u8;N] array and a separate usize variable holding the length. This type is enabled with either the std or fstr option and some functions will print warnings to stderr when capacity is exceeded. This is the only type that does not support no_std, but serde is supported.
  • The type cstr, which is made available with the circular-str option, uses a fixed u8 array that is arranged as a circular queue (aka ring buffer). This allows efficient implementations of pushing/trimming characters in front of the string without additional memory allocation. The downside of these strings is that the underlying representation can be non-contiguous as it allows wrap-around. As a result, there is no efficient way to implement Deref<str>. Additionally, cstr is the only string type of the crate that does not support Unicode. Only single-byte characters are currently supported. There is, however, an iterator over all characters and most common traits are implemented. Serde and no_std are both supported.
  • The Flexstr<N> type becomes available with the flex-str option. This type uses an internal enum that is either a tstr<N> or an owned String (alloc::string::String) in case the length of the string exceeds N-1. This type is designed for situations where strings only occasionally exceed the limit of N-1 bytes. This type does not implement the Copy trait. Serde and no_std are supported.
  • The Sharedstr<N> type becomes available with the shared-str option. This type is similar to a Flexstr<N> but uses a Rc<RefCell<..>> underneath to allow strings to be shared as well as mutated. This type does not implement Copy but Clone is done in constant time. no_std is supported but not serde.
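
The circular-queue layout described for cstr can be sketched as follows. Ring<N> is a hypothetical stand-alone type, not this crate's implementation: it assumes N > 0, accepts only ASCII bytes (matching the single-byte restriction), and exposes the contents as a pair of slices because the text may wrap around the end of the array.

```rust
/// Hypothetical ring-buffer string over a fixed array. Assumes N > 0.
pub struct Ring<const N: usize> {
    buf: [u8; N],
    start: usize, // index of the first byte of the string
    len: usize,   // number of bytes currently stored
}

impl<const N: usize> Ring<N> {
    pub fn new() -> Self {
        Ring { buf: [0; N], start: 0, len: 0 }
    }

    /// Appends one ASCII byte; returns false if full or not ASCII.
    pub fn push_back(&mut self, b: u8) -> bool {
        if self.len == N || !b.is_ascii() {
            return false;
        }
        self.buf[(self.start + self.len) % N] = b;
        self.len += 1;
        true
    }

    /// Prepends one ASCII byte without moving the existing contents.
    pub fn push_front(&mut self, b: u8) -> bool {
        if self.len == N || !b.is_ascii() {
            return false;
        }
        self.start = (self.start + N - 1) % N; // step back, wrapping around
        self.buf[self.start] = b;
        self.len += 1;
        true
    }

    /// Drops up to n bytes from the front by advancing the start index.
    pub fn trim_front(&mut self, n: usize) {
        let n = n.min(self.len);
        self.start = (self.start + n) % N;
        self.len -= n;
    }

    /// Returns the contents as two slices; the second is non-empty only
    /// when the string wraps around the end of the array.
    pub fn to_strs(&self) -> (&str, &str) {
        let first_len = self.len.min(N - self.start);
        let first = &self.buf[self.start..self.start + first_len];
        let second = &self.buf[..self.len - first_len];
        // Only ASCII bytes are ever stored, so both parts are valid UTF-8.
        (
            core::str::from_utf8(first).unwrap(),
            core::str::from_utf8(second).unwrap(),
        )
    }
}
```

Pushing and trimming at the front only adjust an index, which is why no reallocation or shifting is needed. The cost is that a single contiguous &str cannot be handed out once the contents wrap.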

SUMMARY OF OPTIONAL FEATURES

  • serde: Serialization was initially contributed by wallefan and adapted to other types (except Sharedstr). This feature enables the Serialize/Deserialize traits.
  • circular-str: this feature makes available the cstr type.
  • flex-str: this feature makes available the Flexstr type.
  • shared-str: this feature makes available the Sharedstr type.
  • std: this feature cancels no_std by enabling the fstr type. An alias for this feature name is ‘fstr’.
  • pub-tstr: this feature will make the tstr type public. It is not recommended: use instead the type aliases str4 - str256, which are always available.
  • no-alloc: this anti-feature disables any features that require the alloc (or std) crate. It entirely disables the fstr, Flexstr and Sharedstr types: using no-alloc together with flex-str, for example, will not enable the Flexstr type. It also disables the features in tstr, zstr and cstr that require the alloc crate, in particular any use of alloc::string::String. Using this feature is stronger than no_std. Note that when compiled with the all-features option, this feature will be included, which will exclude other features.
  • experimental: the meaning of this feature may change. Currently it implements custom Indexing traits for the zstr type, including IndexMut<usize>, which allows individual bytes to be changed arbitrarily. Experimental features are not part of the documentation.

None of these features is provided by default, so specifying default-features=false has no effect.

SAMPLE BUILD CONFIGURATIONS

The simplest way to install this crate is to run cargo add fixedstr in your crate or to add fixedstr = "0.5" to your dependencies in Cargo.toml. The default build makes available the zstr type and the type aliases str4 - str256 for tstr. Serde is not available with this build but no_std is supported, substituting some std features with those from the alloc crate.

For the smallest possible build, run cargo add fixedstr --features no-alloc in your crate or add the following to Cargo.toml:

  [dependencies]
  fixedstr = {version="0.5", features=["no-alloc"]}

To further enable serde serialization, add the following instead:

  [dependencies]
  fixedstr = {version="0.5", features=["serde","no-alloc"]}

and to exclude cstr but include all other features (except no-alloc):

  [dependencies]
  fixedstr = {version="0.5", features=["std","flex-str","shared-str","serde","pub-tstr","experimental"]}

Do not install this crate with the --all-features option unless you understand that it would include no-alloc, which will disable several types and other features of the crate.

§Examples

 use fixedstr::*;
 let a = str8::from("abcdefg"); //creates new string from &str
 let a1 = a; // copied, not moved
 let a2:&str = a.to_str();
 let a3:String = a.to_string();
 assert_eq!(a.nth_ascii(2), 'c');
 let ab = a.substr(1,5);  // copies substring to new str8
 assert_eq!(ab,"bcde");  // can compare with &str
 assert_eq!(&a[1..4],"bcd"); // implements Index
 assert!(a<ab);  // implements Ord (and Hash, Debug, Display, other traits)
 let mut u:zstr<8> = zstr::from("aλb"); //unicode support
 {assert_eq!(u.nth(1).unwrap(),'λ');} // nth returns Option<char>
 assert!(u.set(1,'μ'));  // changes a character of the same character class
 assert!(!u.set(1,'c')); // .set returns false on failure
 assert!(u.set(2,'c'));
 assert_eq!(u, "aμc");
 assert_eq!(u.len(),4);  // length in bytes
 assert_eq!(u.charlen(),3);  // length in chars
 let mut ac:str16 = a.resize(); // copies to larger capacity string
 let remainder:&str = ac.push("hijklmnopqrst");  //appends string, returns left over
 assert_eq!(ac.len(),15);
 assert_eq!(remainder, "pqrst");
 ac.truncate(10); // shortens string in place
 assert_eq!(&ac,"abcdefghij");
 let (upper,lower) = (str8::make("ABC"), str8::make("abc"));
 assert_eq!(upper, lower.to_ascii_upper()); // no owned String needed
  
 let c1 = str8::from("abcdef"); // string concatenation with + for strN types  
 let c2 = str8::from("xyz123");
 let c3 = c1 + c2;       
 assert_eq!(c3,"abcdefxyz123");   
 assert_eq!(c3.capacity(),15);  // type of c3 is str16

 let c4 = str_format!(str16,"abc {}{}{}",1,2,3); // impls core::fmt::Write
 assert_eq!(c4,"abc 123");  // str_format! truncates if capacity exceeded
 let c5 = try_format!(str8,"abcdef{}","ghijklmn");
 assert!(c5.is_none());  // try_format! returns None if capacity exceeded

 #[cfg(feature = "shared-str")]
 #[cfg(not(feature = "no-alloc"))]
 {
   let mut s:Sharedstr<8> = Sharedstr::from("abcd");
   let mut s2 = s.clone(); // O(1) cost
   s.push_char('e');
   s2.set(0,'A');
   assert_eq!(s2, "Abcde");
   assert!(s==s2 && s.ptr_eq(&s2));
 }

 #[cfg(feature = "experimental")]
 {
   let mut s = <zstr<8>>::from("abcd");
   s[0] = b'A';       // implements IndexMut<usize> (only for zstr)
   assert_eq!(&s[0..3],"Abc");
 }

Macros§

  • Version of to_fixedstr! that returns None instead of truncating.
  • creates a formatted string of given type (by implementing core::fmt::Write):
  • Macro for converting any expression that implements the Display trait into the specified type, similar to to_string but without necessary heap allocation. Truncation is automatic and silent. Example:
  • version of str_format! that returns an Option of the given type.

Structs§

  • character iterator, returned by cstr::chars (available with circular-str option along with cstr)
  • This type is only available with the flex-str option. A Flexstr<N> is represented internally as a tstr<N> if the length of the string is less than N bytes, and by an owned String otherwise. The structure satisfies the following axiom:
  • This type is only available with the ‘shared-str’ option. A Sharedstr uses Rc and RefCell underneath to allow pointers to a crate::Flexstr to be shared, and thus cloning is always done in constant time. Similar to Flexstr, a Sharedstr<N> is represented either by a tstr<N> or by an owned string if its length is greater than N-1, for N up to 256.
  • This type is only available with the circular-str option. A circular string is represented underneath by a fixed-size u8 array arranged as a circular queue. The string can wrap around either end and thus become internally non-contiguous. This allows for efficient implementations of operations such as push, trim in front of the string. However, Deref<str> is not implemented as it cannot be done efficiently. Instead, the cstr::to_strs function returns a pair of string slices, the second of which is non-empty if the string is not contiguous. Additionally, only single-byte characters are currently allowed, although this might change in the future by using a “ghost vector” at the end of the array. An iterator cstr::chars is provided over all single-byte chars, which also forms the foundation of other traits such as Eq, Ord, Hash, etc. The Serialization (serde) and no-std options are both supported.
  • This type is only available with the std (or fstr) feature. A fstr<N> is a string of up to const N bytes, using a separate variable to store the length. This type is not as memory-efficient as some other types such as str4-str256. This is also the only type of the crate that does not support no_std.
  • This structure is normally only accessible through the public types str4 through str256. These types alias internal types tstr<4> through tstr<256> respectively. The purpose here is to guarantee that the maximum size of the structure does not exceed 256 bytes for it uses the first byte of a u8 array to hold the length of the string. The tstr type can be made directly public with the pub-tstr option.
  • zstr<N>: utf-8 strings of size up to N bytes. The strings are zero-terminated with a single byte, with the additional requirement that all bytes following the first zero are also zeros in the underlying array. This allows for an O(log N) zstr::len function. Note that utf8 encodings of unicode characters allow single null bytes to be distinguished as end-of-string.
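
The inline-or-owned and shared representations described for Flexstr and Sharedstr can be sketched with standard-library types only. Flex<N> and Shared<N> below are hypothetical stand-ins, not this crate's definitions: the inline variant stores an explicit length instead of wrapping a tstr<N>, and push_str rebuilds the value through a temporary String for brevity.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical inline-or-heap string: stays inline while shorter than N bytes.
#[derive(Clone, Debug, PartialEq)]
pub enum Flex<const N: usize> {
    Fixed { buf: [u8; N], len: usize },
    Owned(String),
}

impl<const N: usize> Flex<N> {
    pub fn new(s: &str) -> Self {
        if s.len() < N {
            let mut buf = [0u8; N];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            Flex::Fixed { buf, len: s.len() }
        } else {
            Flex::Owned(String::from(s))
        }
    }

    pub fn as_str(&self) -> &str {
        match self {
            // The inline bytes were copied from a whole &str, so they are valid UTF-8.
            Flex::Fixed { buf, len } => std::str::from_utf8(&buf[..*len]).unwrap(),
            Flex::Owned(s) => s.as_str(),
        }
    }

    pub fn is_fixed(&self) -> bool {
        matches!(self, Flex::Fixed { .. })
    }

    /// Appends `s`, switching to the owned variant once N-1 bytes are exceeded.
    pub fn push_str(&mut self, s: &str) {
        let mut joined = String::from(self.as_str());
        joined.push_str(s);
        *self = Flex::new(&joined);
    }
}

/// Hypothetical shared handle: cloning copies a pointer, not the string.
pub struct Shared<const N: usize>(Rc<RefCell<Flex<N>>>);

impl<const N: usize> Clone for Shared<N> {
    fn clone(&self) -> Self {
        Shared(Rc::clone(&self.0)) // constant time, independent of length
    }
}

impl<const N: usize> Shared<N> {
    pub fn new(s: &str) -> Self {
        Shared(Rc::new(RefCell::new(Flex::new(s))))
    }

    /// Mutates the shared string; every clone observes the change.
    pub fn push_str(&self, s: &str) {
        self.0.borrow_mut().push_str(s);
    }

    pub fn get(&self) -> String {
        String::from(self.0.borrow().as_str())
    }

    pub fn ptr_eq(&self, other: &Self) -> bool {
        Rc::ptr_eq(&self.0, &other.0)
    }
}
```

In this sketch a Flex<8> holding "abc" stays inline and moves to the heap only when an append brings it to 8 bytes or more, while two Shared handles created by clone keep pointing at the same underlying string after either one mutates it.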

Type Aliases§